Abstract: We present a practical algorithm to decode
erasures of Reed-Solomon codes over the binary field with
$q$ elements in $O(q \log_2^2 q)$ time, where the constant
implied by the O-notation is very small. Asymptotically fast
algorithms based on fast polynomial arithmetic were already
known, but even if their complexity is similar, they are
mostly impractical. By comparison, our algorithm uses only a
few Walsh transforms and has been easily implemented.

I. Introduction 

A linear error-correcting code of dimension $k$ and
block length $n$ over a finite field $\mathbb{F}_q$ is a
$k$-dimensional linear subspace of the space $\mathbb{F}_q^n$.
Elements of this subspace are called codewords. A linear
code also comes with an encoding function that maps in a
unique way an element of $\mathbb{F}_q^k$ (the message) into
a codeword. By erasure decoding, we mean the task of
recovering the message knowing only a subset of the
coordinates of its encoding.

Of course, if we know fewer than k coordinates of 
a codeword, there is always more than one possible 
corresponding message. Thus, a code is optimal with 
respect to recovering erasures if given any subset of k 
coordinates of a codeword, there is only one possible 
corresponding message. Such a code is called a maximum
distance separable (MDS) code. A standard and famous
class of MDS codes is given by Reed-Solomon codes
[4]. Such a code is obtained by evaluating polynomials
over $\mathbb{F}_q$ with degree less than $k$ at $n$
different points in $\mathbb{F}_q$. Their length $n$ is thus
bounded by the number of points in $\mathbb{F}_q$, that is $q$.
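The evaluation view of a Reed-Solomon code can be sketched in a few lines. The block below is only an illustration under stated assumptions: it works in the toy field $\mathbb{F}_{2^4}$ built from the polynomial $x^4 + x + 1$ (our choice, not taken from the paper), represents field elements as integers whose bits are the coordinates over $\mathbb{F}_2$, and the names `gf_mul`, `rs_encode` and `weight` are hypothetical helpers.

```python
M = 4                # assumed toy field GF(2^4), so q = 16
Q = 1 << M
POLY = 0b10011       # x^4 + x + 1, an assumed irreducible polynomial


def gf_mul(a, b):
    """Multiply in GF(2^M): carry-less product reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & Q:            # degree reached M, so reduce
            a ^= POLY
        b >>= 1
    return result


def rs_encode(message, n=Q):
    """Evaluate the polynomial with coefficients `message` at points 0..n-1.

    message[0] is the constant term; the dimension is k = len(message).
    """
    codeword = []
    for x in range(n):
        acc = 0
        for coeff in reversed(message):   # Horner's rule, addition is XOR
            acc = gf_mul(acc, x) ^ coeff
        codeword.append(acc)
    return codeword


def weight(codeword):
    """Number of nonzero symbols of a codeword."""
    return sum(symbol != 0 for symbol in codeword)


codeword = rs_encode([1, 2, 3])   # k = 3, n = q = 16
```

Since a nonzero polynomial of degree less than $k$ has at most $k - 1$ roots, every nonzero codeword of this toy code has weight at least $n - k + 1 = 14$, which is the MDS property in its distance form.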

Decoding erasures of any linear code can be done by a 
simple Gaussian elimination with $O(k^3)$ operations. For
Reed-Solomon codes, classical algorithms can decode
in $O(k^2)$ and encode in $O(kn)$. Theoretically, using fast
polynomial arithmetic [1], we can encode and decode
them in $O(n \log^2 n \log\log n)$ [4, p. 369]. However, the
algorithms involved are complex and there is a large 
constant hidden in this asymptotic complexity. Hence, 
from a practical point of view, only the quadratic time 
algorithms were useful, see for instance [2]. 



In this paper, we present a practical algorithm that
can decode erasures of Reed-Solomon codes over
$\mathbb{F}_q$ with $q = 2^m$ in $O(q \log_2^2 q)$ time. It
uses $O(q \log_2 q)$ memory, but we also have a version
using $O(q)$ memory with complexity $O(q \log_2^3 q)$. Here,
the field operations are counted as $O(1)$ and the memory
to store one field element is counted as $O(1)$ too.

This algorithm is simple (our full C implementation 
is less than 500 lines of code) and has a very small 
constant in its complexity. It uses the Walsh transform 
instead of the discrete Fourier transform used in previous 
asymptotically fast algorithms. This is possible because
we actually never compute the coefficients of the
involved polynomials, but just manipulate their Lagrange
form at the points we received.

Notice that we can use the same algorithm to encode
Reed-Solomon codes in a systematic way by choosing
the first $k$ positions of a codeword and then erasure
decoding. Notice as well that the complexity does not
depend on $k$ or $n$, so it is best to use Reed-Solomon
codes of length close to $q$ and a dimension $k$ of the same
order. In the case where only a few systematic symbols are
missing, we can also decode in $O(q \log_2 q)$ operations
plus $O(k)$ operations per missing symbol.

Erasure codes with faster encoding and decoding 
complexity exist [3]. Such codes are binary codes and 
thus cannot be MDS, that is they require a little more 
than k symbols to be able to recover the message. Due 
to their low complexity and binary nature, these codes 
have many practical applications, but in some situations 
it may be better to use the classical Reed-Solomon codes. 
This is in particular the case when high rate codes are 
needed or when one prefers not to waste any redundancy 
at the price of a slightly higher complexity. 

The organization of the paper is straightforward.
We start by presenting an outline of our algorithm before
discussing its two main steps in detail. Along the way, we
also recall basic facts about the Walsh transform.



II. Algorithm outline

We start by fixing some notation. We will mainly work
on the binary field $\mathbb{F}_{2^m}$ with $q$ elements. We
will use the notation $\oplus$ for the addition in this
field to avoid confusion with the normal addition. Seeing
this field as an $m$-dimensional space over $\mathbb{F}_2$,
we represent its elements as binary vectors $x$ of length $m$.

The codewords of a Reed-Solomon code of dimension
$k$ over $\mathbb{F}_{2^m}$ are in one-to-one correspondence
with the polynomials of degree less than $k$ over
$\mathbb{F}_{2^m}$. Given such a polynomial $P$, we will take
for its corresponding codeword the evaluation of $P$ at all
the points $x$ of $\mathbb{F}_{2^m}$, that is the image vector
$(P(x), x \in \mathbb{F}_{2^m})$. It is of course possible to
use a smaller length, but our algorithm will recover the
full image vector anyway.

We will always order the points $x = (x_1, \dots, x_m)$
of $\mathbb{F}_{2^m}$ by lexicographical order over the binary
vector $(x_1, \dots, x_m)$. Moreover, for any function $F$
defined over $\mathbb{F}_{2^m}$, we will write $[F]$ for its
image vector over all the $q$ points of $\mathbb{F}_{2^m}$
taken in this order.

Suppose now that we see only $k$ points (or more) of a
given vector $[P]$. We will show that we can then compute
all the points of this vector in $O(q \log_2^2 q)$ time. This
operation is sufficient to both encode and decode the
code. To encode in a systematic way, we can set the first
$k$ positions of the vector $[P]$ as we want and then use
the algorithm to compute the parity symbols. To decode,
we will just reconstruct the vector from any $k$ symbols
and read the encoded information at the beginning of the
vector.

We will write $R \subset \mathbb{F}_{2^m}$ for a set of $k$
positions among the received ones. Since $P$ is a polynomial
of degree less than $k$, it is uniquely determined by its
values at $k$ points of the field $\mathbb{F}_{2^m}$. Using the
well known Lagrange interpolation formula, we have:

Definition 1 (Lagrange Form): Given the values of a
polynomial $P$ of degree less than $k$ on a subset $R$ of $k$
points in $\mathbb{F}_{2^m}$, its Lagrange form is

$$P(x) = \bigoplus_{u \in R} c_u \prod_{y \in R,\, y \neq u} (x \oplus y) \tag{1}$$

where the $c_u$ are the Lagrange coefficients and belong to
$\mathbb{F}_{2^m}$.

Our algorithm works in two steps that we will detail
in the next two sections. The first step is to compute
the coefficients $c_u$ and it runs in time $O(q \log_2 q)$.
The second step is to evaluate the Lagrange form of $P$ at
all the points of $\mathbb{F}_{2^m}$ and runs in time
$O(q \log_2^2 q)$. Both steps rely heavily on Walsh
transform computation.

III. Computing Lagrange coefficients

Theorem 1: Given the values at $k$ points of a polynomial
of degree less than $k$ over $\mathbb{F}_{2^m}$, we can
compute its Lagrange coefficients at these points in
$O(q \log_2 q)$ time and $O(q)$ memory.

If we apply the formula (1) at a point $x \in R$ where
we know $P(x)$, we have

$$P(x) = c_x \prod_{y \in R,\, y \neq x} (x \oplus y) \quad \forall x \in R. \tag{2}$$

The main difficulty to compute the coefficients $c_x$ is then
to evaluate the product, that is

$$\Pi(x) := \prod_{y \in R,\, y \neq x} (x \oplus y). \tag{3}$$
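As a reference point, the relations (1) to (3) can be checked with a direct quadratic-time computation. The sketch below is not the fast method of this section; it assumes the toy field $\mathbb{F}_{2^4}$ with the polynomial $x^4 + x + 1$ (our choice), and every function name is hypothetical.

```python
M = 4                # assumed toy field GF(2^4), so q = 16
Q = 1 << M
POLY = 0b10011       # x^4 + x + 1, an assumed irreducible polynomial


def gf_mul(a, b):
    """Multiply in GF(2^M): carry-less product reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return result


def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(q-2)."""
    result = 1
    for _ in range(Q - 2):
        result = gf_mul(result, a)
    return result


def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[0] is the constant term."""
    acc = 0
    for coeff in reversed(coeffs):
        acc = gf_mul(acc, x) ^ coeff
    return acc


def pi_naive(x, received):
    """Pi(x): product of (x xor y) over received y != x, as in (3)."""
    result = 1
    for y in received:
        if y != x:
            result = gf_mul(result, x ^ y)
    return result


def lagrange_coefficients(values, received):
    """c_x = P(x) / Pi(x) for every received position x, from (2)."""
    return {x: gf_mul(values[x], gf_inv(pi_naive(x, received)))
            for x in received}


def lagrange_eval(x, c, received):
    """Evaluate the Lagrange form (1) at an arbitrary point x."""
    total = 0
    for u in received:
        term = c[u]
        for y in received:
            if y != u:
                term = gf_mul(term, x ^ y)
        total ^= term
    return total
```

Evaluating `lagrange_eval` at all $q$ points costs $O(q k^2)$ field operations here, which is the cost the two Walsh-based steps are designed to avoid.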

We define a function $R$ from $\mathbb{F}_{2^m}$ into
$\{0, 1\}$ such that $R(x) = 1$ if and only if $x \in R$. We
can see this function as the indicator function of the
received positions. With this definition, we can rewrite
the product (3) as

$$\Pi(x) = \prod_{y \in \mathbb{F}_{2^m},\, y \neq x} (x \oplus y)^{R(y)}. \tag{4}$$

By fixing a primitive element $\alpha$ of the multiplicative
group $\mathbb{F}_{2^m}^*$, we can also define the discrete
logarithm function $L : \mathbb{F}_{2^m}^* \to [0, q - 2]$ such
that $L(x) = i$ if and only if $x = \alpha^i$. If we extend
this function to $\mathbb{F}_{2^m}$ by setting $L(0) := 0$, we
have, modulo $q - 1$,

$$L(\Pi(x)) = \sum_{y \in \mathbb{F}_{2^m}} R(y) L(x \oplus y) \tag{5}$$

and more importantly $\alpha^{L(\Pi(x))} = \Pi(x)$ for all $x$
in $\mathbb{F}_{2^m}$. This is because $\Pi(x)$ is never equal
to zero. For someone familiar with the Walsh transform, it
is now easy to notice that the value of the expression (5)
can be computed for all $x$ in $O(q \log_2 q)$ operations. Of
course we first have to compute the vector $[L]$ but this
can be done in linear time and we can even precompute its
Walsh transform.

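Since the Walsh transform plays a key role in this
paper, we recall how it works. The Walsh transform
$\hat{R}$ of a function $R : \mathbb{F}_{2^m} \to \mathbb{Z}$
is a linear transform defined by

$$\hat{R}(x) = \sum_{y \in \mathbb{F}_{2^m}} R(y) (-1)^{x \cdot y} \tag{6}$$

where $x \cdot y$ denotes the scalar product of the binary
vectors $x$ and $y$. Moreover, we have the inverse formula

$$R(x) = \frac{1}{q} \hat{\hat{R}}(x) \tag{7}$$

and for two functions $R$ and $L$ from $\mathbb{F}_{2^m}$
into $\mathbb{Z}$, if we define the convolution product $*$ by

$$(R * L)(x) := \sum_{y \in \mathbb{F}_{2^m}} R(y) L(x \oplus y) \tag{8}$$

we have

$$\widehat{R * L} = \hat{R} \hat{L}. \tag{9}$$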
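The identities (7) and (9) are easy to check numerically with a transform written straight from the definition (6). The following sketch makes one assumption about representation: a point of $\mathbb{F}_{2^m}$ is stored as an integer whose bits are its coordinates, so that $x \cdot y$ is the parity of `x & y`. The function names are ours.

```python
def walsh_naive(f):
    """Walsh transform from definition (6); f is a list indexed by x in 0..q-1."""
    q = len(f)
    return [sum(f[y] * (-1) ** bin(x & y).count("1") for y in range(q))
            for x in range(q)]


def xor_convolution(f, g):
    """(f * g)(x) = sum over y of f(y) g(x xor y), as in (8)."""
    q = len(f)
    return [sum(f[y] * g[x ^ y] for y in range(q)) for x in range(q)]


def xor_convolution_walsh(f, g):
    """Same convolution through three Walsh transforms, using (7) and (9)."""
    q = len(f)
    product = [a * b for a, b in zip(walsh_naive(f), walsh_naive(g))]
    # The transform of the product equals q times the convolution.
    return [value // q for value in walsh_naive(product)]
```

Both convolution routines cost $O(q^2)$ as written; the gain only appears once the naive transform is replaced by the fast one.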



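So we can compute (5) by performing three Walsh transforms.
Finally, the Walsh transform can be computed
efficiently in $O(q \log_2 q)$ by working on the image vector
$[R]$ of a function thanks to the induction relation

$$[\hat{R}] = [\hat{R}_0 + \hat{R}_1 \mid \hat{R}_0 - \hat{R}_1] \tag{10}$$

where both $R_0$ and $R_1$ are functions from
$\mathbb{F}_{2^{m-1}}$ into $\mathbb{Z}$ defined by
$R_0(x_1, \dots, x_{m-1}) := R(0, x_1, \dots, x_{m-1})$
and $R_1(x_1, \dots, x_{m-1}) := R(1, x_1, \dots, x_{m-1})$.
Indeed, splitting the sum (6) on the first coordinate of $y$
gives $\hat{R}(b, x') = \hat{R}_0(x') + (-1)^b \hat{R}_1(x')$,
and the points with first coordinate $0$ come first in the
lexicographical order. This algorithm is known as the fast
Walsh transform.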
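A minimal sketch of the fast Walsh transform follows, in a recursive form that mirrors the induction on the first coordinate and in the usual iterative butterfly form. It assumes that the image vector is a list indexed by the integer whose most significant bit is $x_1$, so that the lexicographical order is the integer order; the function names are ours, and the naive transform is included only for comparison.

```python
def walsh_naive(f):
    """Reference transform computed directly from the definition, O(q^2)."""
    q = len(f)
    return [sum(f[y] * (-1) ** bin(x & y).count("1") for y in range(q))
            for x in range(q)]


def fast_walsh(f):
    """Recursive fast Walsh transform, splitting on the first coordinate."""
    if len(f) == 1:
        return list(f)
    half = len(f) // 2
    t0 = fast_walsh(f[:half])      # transform of R_0 = R(0, ...)
    t1 = fast_walsh(f[half:])      # transform of R_1 = R(1, ...)
    return ([a + b for a, b in zip(t0, t1)] +
            [a - b for a, b in zip(t0, t1)])


def fast_walsh_iterative(f):
    """Same transform with butterflies, O(q log q) time, no recursion."""
    v = list(f)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                a, b = v[i], v[i + h]
                v[i], v[i + h] = a + b, a - b
        h *= 2
    return v
```

The iterative form is the one usually preferred in practice, since it only needs the vector itself as working memory.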

We remark that for computing the Lagrange coefficients,
we are only interested in the values modulo $q - 1$.
Since $q$ is equal to 1 modulo $q - 1$, we can perform all
the above computations modulo $q - 1$. With this tweak,
the Walsh transform becomes involutive, and we do not
need more than $m$ bits per value.
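The whole first step can be sketched as follows: take discrete logarithms, convolve with the indicator of the received positions through Walsh transforms modulo $q - 1$, and exponentiate. This is an illustration under assumptions of ours, namely the toy field $\mathbb{F}_{2^4}$ built from $x^4 + x + 1$ with $\alpha = 2$ as primitive element, and hypothetical function names; the naive product is kept only to check the result.

```python
M = 4                # assumed toy field GF(2^4), so q = 16
Q = 1 << M
POLY = 0b10011       # x^4 + x + 1, an assumed primitive polynomial
ALPHA = 2            # the class of x, assumed primitive for this POLY


def gf_mul(a, b):
    """Multiply in GF(2^M): carry-less product reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return result


EXP = [1] * (Q - 1)                  # EXP[i] = alpha^i
for i in range(1, Q - 1):
    EXP[i] = gf_mul(EXP[i - 1], ALPHA)
LOG = [0] * Q                        # discrete logarithm, with LOG[0] := 0
for i, value in enumerate(EXP):
    LOG[value] = i


def walsh_mod(values, modulus):
    """Fast Walsh transform with every value reduced modulo `modulus`."""
    v = [x % modulus for x in values]
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                a, b = v[i], v[i + h]
                v[i], v[i + h] = (a + b) % modulus, (a - b) % modulus
        h *= 2
    return v


def pi_all(received):
    """Pi(x) for every x in the field, using three transforms modulo q - 1."""
    n = Q - 1
    indicator = [1 if x in received else 0 for x in range(Q)]
    r_hat = walsh_mod(indicator, n)
    l_hat = walsh_mod(LOG, n)        # independent of the input, can be precomputed
    # No division by q is needed because q = 1 modulo q - 1.
    log_pi = walsh_mod([a * b for a, b in zip(r_hat, l_hat)], n)
    return [EXP[e] for e in log_pi]


def pi_naive(x, received):
    """Direct product of (x xor y) over received y != x, for checking."""
    result = 1
    for y in received:
        if y != x:
            result = gf_mul(result, x ^ y)
    return result
```

The Lagrange coefficient at a received position $x$ is then $P(x)$ divided by the corresponding entry of `pi_all(received)`.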

IV. Evaluating a Lagrange form 

Theorem 2: Given the Lagrange form of a polynomial
$P$ over $\mathbb{F}_{2^m}$, we can compute its image vector
$[P]$ in $O(q \log_2^2 q)$ time using $O(q \log_2 q)$ memory.

The idea to achieve this complexity starts by rewriting
the Lagrange form (1) of $P$ as

$$P(x) = \Pi(x) \bigoplus_{y \in R} \frac{c_y}{x \oplus y} \quad \forall x \notin R. \tag{11}$$

To write this more conveniently, let us define an inverse
function $I : \mathbb{F}_{2^m} \to \mathbb{F}_{2^m}$ that maps
$x \neq 0$ to $x^{-1}$ and $0$ to $0$. We define as well a
coefficient function $C$ from $\mathbb{F}_{2^m}$ into
$\mathbb{F}_{2^m}$ that maps $x$ to $c_x$ if $x \in R$ and to
$0$ otherwise. For the positions $x \notin R$ (the values at
$x \in R$ are already known), we then have

$$P(x) = \Pi(x) \bigoplus_{y \in \mathbb{F}_{2^m}} C(y) I(x \oplus y). \tag{12}$$

We already computed in the previous section the vector
$[\Pi]$, so the only work left is to compute the sum on
the right. This really looks like the convolution product
defined in the last section except the functions now
take their values in $\mathbb{F}_{2^m}$ and not in
$\mathbb{Z}$. To overcome this difficulty, we are going to
look at the vectorial representation of the elements of
$\mathbb{F}_{2^m}$ over $\mathbb{F}_2$.

We will need some more notation. For $i \in \{1, \dots, m\}$,
we will write $e_i$ for the $i$-th elementary basis element
of the vector space $\mathbb{F}_{2^m}$ over $\mathbb{F}_2$.
For a function $C$ from $\mathbb{F}_{2^m}$ into
$\mathbb{F}_{2^m}$, we will write $C_i$ for its $i$-th
component, that is the function that maps $x$ into the
coefficient of $e_i$ in $C(x)$. We can now rewrite (12) as

$$P(x) = \Pi(x) \bigoplus_{i=1}^{m} \bigoplus_{j=1}^{m} e_i e_j \bigoplus_{y \in \mathbb{F}_{2^m}} C_i(y) I_j(x \oplus y). \tag{13}$$

Now, the sum over $y$ that we have to compute $m^2$ times
can be computed in $O(q \log_2 q)$ time. This is because we
now have Boolean functions. We can thus see them as
functions in $\mathbb{Z}$, compute

$$\frac{1}{q} \widehat{\hat{C}_i \hat{I}_j} \tag{14}$$

using the fast Walsh transform and take the parity of
each value in the resulting vector. That is, we have

$$P(x) = \Pi(x) \bigoplus_{i=1}^{m} \bigoplus_{j=1}^{m} e_i e_j \, \mathrm{parity}\left( \frac{1}{q} \widehat{\hat{C}_i \hat{I}_j}(x) \right). \tag{15}$$

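Hence, we can evaluate the Lagrange form at all the
points of $\mathbb{F}_{2^m}$ in $O(q \log_2^3 q)$ operations.
In order to be faster, we will choose $e_i = \alpha^i$ for a
primitive element $\alpha$ of $\mathbb{F}_{2^m}$, with the
indices $i$ and $j$ now running from $0$ to $m - 1$. We can
then rewrite the formula (15) as

$$P(x) = \Pi(x) \bigoplus_{s=0}^{2(m-1)} \alpha^s \bigoplus_{i+j=s} \mathrm{parity}\left( \frac{1}{q} \widehat{\hat{C}_i \hat{I}_j}(x) \right). \tag{16}$$

Now, by using the linearity of the parity function and of
the Walsh transform, we get

$$P(x) = \Pi(x) \bigoplus_{s=0}^{2(m-1)} \alpha^s \, \mathrm{parity}\left( \frac{1}{q} \widehat{\sum_{i+j=s} \hat{C}_i \hat{I}_j}(x) \right). \tag{17}$$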

If we compute and store the $[\hat{C}_i]$ and the
$[\hat{I}_j]$ (which requires $O(q \log_2 q)$ memory and
$O(q \log_2^2 q)$ operations), we can evaluate this formula
in $O(q \log_2^2 q)$. Indeed, we have to compute $2(m - 1)$
Walsh transforms of functions which have an image vector
that can be computed in $O(q \log_2 q)$.

We remark that since we only need the parity of the
result, we can perform the Walsh transforms by keeping
only the $m + 1$ lower bits of all the values. Notice as
well that we can precompute the $[\hat{I}_j]$.
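The second step can be sketched end to end on a toy field. The block below groups the component products by $s = i + j$, applies one Walsh transform per value of $s$, and keeps the parity after the division by $q$. It rests on assumptions of ours: the field $\mathbb{F}_{2^4}$ built from $x^4 + x + 1$ with $\alpha = 2$, so that bit $i$ of an element is its coefficient on $\alpha^i$; full-width integers instead of the $m + 1$ low bits; and a naive computation of $\Pi$ and of the coefficients, which stands in for the first step. All names are hypothetical.

```python
M = 4                    # assumed toy field GF(2^4), so q = 16
Q = 1 << M
POLY = 0b10011           # x^4 + x + 1; alpha = 2 is assumed primitive


def gf_mul(a, b):
    """Multiply in GF(2^M): carry-less product reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return result


EXP = [1] * (Q - 1)                      # EXP[s] = alpha^s
for s in range(1, Q - 1):
    EXP[s] = gf_mul(EXP[s - 1], 2)
INV = [0] * Q                            # I(x) = x^-1, with I(0) = 0
for s, value in enumerate(EXP):
    INV[value] = EXP[(Q - 1 - s) % (Q - 1)]


def fast_walsh(values):
    """Fast Walsh transform over the integers."""
    v = list(values)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for i in range(start, start + h):
                a, b = v[i], v[i + h]
                v[i], v[i + h] = a + b, a - b
        h *= 2
    return v


def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[0] is the constant term."""
    acc = 0
    for coeff in reversed(coeffs):
        acc = gf_mul(acc, x) ^ coeff
    return acc


def pi_naive(x, received):
    """Product of (x xor y) over received y != x (stand-in for step one)."""
    result = 1
    for y in received:
        if y != x:
            result = gf_mul(result, x ^ y)
    return result


def evaluate_lagrange_form(c, pi, received):
    """Return P(x) for every x outside `received`, grouping terms by s = i + j."""
    coeff = [c.get(y, 0) for y in range(Q)]              # the function C
    c_hat = [fast_walsh([(v >> i) & 1 for v in coeff]) for i in range(M)]
    i_hat = [fast_walsh([(v >> j) & 1 for v in INV]) for j in range(M)]
    total = [0] * Q
    for s in range(2 * (M - 1) + 1):
        acc = [0] * Q
        for i in range(max(0, s - M + 1), min(s, M - 1) + 1):
            acc = [a + u * w for a, u, w in zip(acc, c_hat[i], i_hat[s - i])]
        bits = [(w >> M) & 1 for w in fast_walsh(acc)]   # parity of w / q
        total = [t ^ (EXP[s] if bit else 0) for t, bit in zip(total, bits)]
    return {x: gf_mul(pi[x], total[x]) for x in range(Q) if x not in received}


P = [7, 1, 9, 4]                                         # k = 4
received = [0, 3, 6, 10]
pi = [pi_naive(x, received) for x in range(Q)]
c = {u: gf_mul(poly_eval(P, u), INV[pi[u]]) for u in received}
recovered = evaluate_lagrange_form(c, pi, received)
```

Storing all the transformed components takes $O(q \log_2 q)$ integers, in line with the memory bound of Theorem 2; recomputing them for each pair $(i, j)$ instead would trade that memory for extra transforms.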

If the number of erased systematic symbols is small,
an alternative decoding method is to evaluate (1) directly
at these erased positions. So we still use $O(q \log_2 q)$
operations to compute the coefficients $c_u$ but then it is
only $O(k)$ operations per erased message symbol.
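One way to reach $O(k)$ field operations per erased symbol is sketched below: for an erased position $x$, a single pass over the received positions accumulates both the product of the $(x \oplus y)$ and the sum of the $c_y / (x \oplus y)$. This is our reading of the direct evaluation, not necessarily the authors' implementation. The block assumes the toy field $\mathbb{F}_{2^4}$ built from $x^4 + x + 1$ with $\alpha = 2$, and computes the coefficients naively as a stand-in for the Walsh-based first step; all names are hypothetical.

```python
M = 4                    # assumed toy field GF(2^4), so q = 16
Q = 1 << M
POLY = 0b10011           # x^4 + x + 1; alpha = 2 is assumed primitive


def gf_mul(a, b):
    """Multiply in GF(2^M): carry-less product reduced modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & Q:
            a ^= POLY
        b >>= 1
    return result


EXP = [1] * (Q - 1)
for s in range(1, Q - 1):
    EXP[s] = gf_mul(EXP[s - 1], 2)
INV = [0] * Q                            # table of inverses, INV[0] = 0
for s, value in enumerate(EXP):
    INV[value] = EXP[(Q - 1 - s) % (Q - 1)]


def poly_eval(coeffs, x):
    """Horner evaluation; coeffs[0] is the constant term."""
    acc = 0
    for coeff in reversed(coeffs):
        acc = gf_mul(acc, x) ^ coeff
    return acc


def lagrange_coefficients(values, received):
    """Naive O(k^2) stand-in for step one: c_u = P(u) / prod (u xor y)."""
    c = {}
    for u in received:
        prod = 1
        for y in received:
            if y != u:
                prod = gf_mul(prod, u ^ y)
        c[u] = gf_mul(values[u], INV[prod])
    return c


def evaluate_at(x, c, received):
    """P(x) for an erased position x, in one O(k) pass over the received set."""
    product = 1      # product of (x xor y) over the received y
    total = 0        # sum of c_y / (x xor y) over the received y
    for y in received:
        product = gf_mul(product, x ^ y)
        total ^= gf_mul(c[y], INV[x ^ y])
    return gf_mul(product, total)
```

The routine must only be called at positions outside the received set, since the division by $x \oplus y$ is undefined when $x = y$.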

Finally, an interesting fact is that $k$ plays only a
minor role in the implementation. We can just use all
the available points to decode and if we received more
symbols than the degree of the polynomial sent, we will
just recover the sent vector.

V. Conclusion

We presented a practical and fast algorithm to encode
and decode Reed-Solomon codes over the erasure channel.
This algorithm should be more efficient in software
than the previous algorithms as soon as the size of the
field is $2^{10}$, or even smaller.

Moreover, it allows the use of long Reed-Solomon
codes which may have some applications. For example,
our implementation can encode and decode over
$\mathbb{F}_{2^{16}}$ in less than a second or over
$\mathbb{F}_{2^{20}}$ in a few seconds (on an Intel Core 2
at 1.86 GHz). One may wonder where we can
use such codes, but they may be worth considering, for
example, to recover from failure on a storage system or
to send big files over the Internet.

Finally, we considered only the case of a binary
field. The same approach, namely computing Lagrange
coefficients and evaluating the Lagrange form directly, is
certainly applicable to any field. However, generalizing
the way we use the Walsh transform is not straightforward
and it seems unlikely that it will lead to a practical
algorithm.